Use CTRL/CMD + Shift + K to knit and preview your markdown.
Above is the YAML header. YAML originally stood for "Yet Another Markup Language" and now officially stands for "YAML Ain't Markup Language"; it is a human-readable data serialization language. The header sets options for your markdown and gives you a nicely formatted preamble. It is currently set to some of my preferred settings, but feel free to play around with it and make it your own.
We have it set here to output HTML, but you can just as easily produce PDF or Word documents. There are a bunch of built-in themes that you can explore here. I quite like the readable theme.
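The same settings can also be expressed from R, which can make it easier to see what the header is doing. This is only a sketch: the file name `report.Rmd` is hypothetical, and the options shown are my guess at settings like the ones described here, not a copy of this document's header.

```r
library(rmarkdown)

# Build an HTML output format with the "readable" theme and numbered headers.
# This mirrors what `output: html_document` with `theme` and
# `number_sections` would do in a YAML header.
html_fmt <- html_document(theme = "readable", number_sections = TRUE)

# Rendering needs an actual .Rmd file, so these calls are left commented out.
# "report.Rmd" is a hypothetical file name.
# render("report.Rmd", output_format = html_fmt)
# render("report.Rmd", output_format = "pdf_document")   # needs a LaTeX install
# render("report.Rmd", output_format = "word_document")
```

Swapping the output format is usually all it takes to switch between HTML, PDF, and Word, although HTML-only features such as interactive widgets will not carry over to the other formats.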
Below is a code chunk. You will notice that this one
does not appear in the rendered document - this is because it is the
setup chunk, and it has the option
include=FALSE set. We use this to set options for chunk
behavior, as well as loading packages and such.
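Since the setup chunk is hidden in the rendered document, here is a sketch of what one typically contains. It is an illustration, not the chunk used in this document, and the particular options and packages are assumptions.

```r
# Typical body of a setup chunk. The chunk header itself would carry
# include=FALSE so that neither this code nor its output is rendered.
knitr::opts_chunk$set(
  echo = TRUE,       # show code by default; override per chunk with echo = FALSE
  message = FALSE,   # hide package startup messages
  warning = FALSE    # hide warnings in the rendered output
)

# Load the packages used later in the document (assumed here).
library(dplyr)
library(ggplot2)
```

Options set this way become the defaults for every later chunk, and any single chunk can still override them in its own header.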
The # above makes a header. A single # is
the largest header, and extra #s are smaller headers.
This header is automatically numbered because of the YAML settings
and the double #.
Here we will explore some proper code chunks. You can use
CTRL/CMD + ALT + I to create a new chunk. If we want to run
code, but not show it, we can use echo = FALSE in the chunk
options.
Otherwise, our code chunk will be visible. Let’s show off our example regression from the FSCI paper.
lm <- lm(normvalue ~ year + FSCI_region, data = df)
summary(lm)
##
## Call:
## lm(formula = normvalue ~ year + FSCI_region, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -31.370 -8.881 -3.519 6.425 72.022
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 802.36120 99.04397 8.101 8.45e-16
## year -0.39300 0.04928 -7.975 2.29e-15
## FSCI_regionEastern Asia 11.45318 2.49413 4.592 4.61e-06
## FSCI_regionLatin America & Caribbean 0.24044 1.75020 0.137 0.89074
## FSCI_regionNorthern Africa & Western Asia -3.01052 1.81558 -1.658 0.09741
## FSCI_regionNorthern America and Europe -7.85822 2.04028 -3.852 0.00012
## FSCI_regionOceania -0.14376 2.09696 -0.069 0.94535
## FSCI_regionSouth-eastern Asia 2.58542 1.95749 1.321 0.18669
## FSCI_regionSouthern Asia 4.86597 2.03578 2.390 0.01691
## FSCI_regionSub-Saharan Africa 15.54695 1.69547 9.170 < 2e-16
##
## (Intercept) ***
## year ***
## FSCI_regionEastern Asia ***
## FSCI_regionLatin America & Caribbean
## FSCI_regionNorthern Africa & Western Asia .
## FSCI_regionNorthern America and Europe ***
## FSCI_regionOceania
## FSCI_regionSouth-eastern Asia
## FSCI_regionSouthern Asia *
## FSCI_regionSub-Saharan Africa ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 14.93 on 2484 degrees of freedom
## Multiple R-squared: 0.2395, Adjusted R-squared: 0.2368
## F-statistic: 86.93 on 9 and 2484 DF, p-value: < 2.2e-16
This shows our code and output, but it is not terribly clean looking.
To get a cleaner regression output, we can convert our regression
output to a data frame, then use knitr::kable() to create a
nice looking table.
lm_df <- broom::tidy(lm)
knitr::kable(lm_df)
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 802.3611950 | 99.0439650 | 8.1010609 | 0.0000000 |
| year | -0.3930000 | 0.0492762 | -7.9754489 | 0.0000000 |
| FSCI_regionEastern Asia | 11.4531769 | 2.4941272 | 4.5920580 | 0.0000046 |
| FSCI_regionLatin America & Caribbean | 0.2404370 | 1.7502015 | 0.1373768 | 0.8907441 |
| FSCI_regionNorthern Africa & Western Asia | -3.0105244 | 1.8155793 | -1.6581619 | 0.0974110 |
| FSCI_regionNorthern America and Europe | -7.8582226 | 2.0402820 | -3.8515374 | 0.0001203 |
| FSCI_regionOceania | -0.1437598 | 2.0969580 | -0.0685563 | 0.9453483 |
| FSCI_regionSouth-eastern Asia | 2.5854171 | 1.9574857 | 1.3207847 | 0.1866948 |
| FSCI_regionSouthern Asia | 4.8659655 | 2.0357824 | 2.3902188 | 0.0169124 |
| FSCI_regionSub-Saharan Africa | 15.5469469 | 1.6954749 | 9.1696708 | 0.0000000 |
Extra steps to get the column names capitalized and the numbers rounded:
lm_df_cleaner <- lm_df %>%
  mutate(across(where(is.numeric), ~ round(.x, 3))) %>%
  setNames(snakecase::to_title_case(names(.)))
knitr::kable(lm_df_cleaner)
| Term | Estimate | Std Error | Statistic | P Value |
|---|---|---|---|---|
| (Intercept) | 802.361 | 99.044 | 8.101 | 0.000 |
| year | -0.393 | 0.049 | -7.975 | 0.000 |
| FSCI_regionEastern Asia | 11.453 | 2.494 | 4.592 | 0.000 |
| FSCI_regionLatin America & Caribbean | 0.240 | 1.750 | 0.137 | 0.891 |
| FSCI_regionNorthern Africa & Western Asia | -3.011 | 1.816 | -1.658 | 0.097 |
| FSCI_regionNorthern America and Europe | -7.858 | 2.040 | -3.852 | 0.000 |
| FSCI_regionOceania | -0.144 | 2.097 | -0.069 | 0.945 |
| FSCI_regionSouth-eastern Asia | 2.585 | 1.957 | 1.321 | 0.187 |
| FSCI_regionSouthern Asia | 4.866 | 2.036 | 2.390 | 0.017 |
| FSCI_regionSub-Saharan Africa | 15.547 | 1.695 | 9.170 | 0.000 |
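One side effect of rounding is that very small p-values show up as 0.000. A possible fix is to format the p-value column with base R's `format.pval()` before rounding the rest. The sketch below uses the built-in `mtcars` data as a stand-in, since the `df` from the example is not available here; the names `fit` and `fit_df` are made up for illustration.

```r
library(dplyr)

# Stand-in model on a built-in dataset (not the FSCI regression).
fit <- lm(mpg ~ wt + hp, data = mtcars)

fit_df <- broom::tidy(fit) %>%
  # Format p-values first: anything below eps is printed as "<0.001"
  mutate(p.value = format.pval(p.value, digits = 3, eps = 0.001)) %>%
  # p.value is now character, so only the remaining numeric columns are rounded
  mutate(across(where(is.numeric), ~ round(.x, 3)))

knitr::kable(fit_df)
```

The order matters: if the rounding came first, the tiny p-values would already be 0 by the time `format.pval()` saw them.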
For a very clean table with less work, try the sjPlot
package:
sjPlot::tab_model(
  lm,
  p.style = 'stars',
  digits = 2,
  show.se = TRUE
)
|  | normvalue |  |  |
|---|---|---|---|
| Predictors | Estimates | std. Error | CI |
| (Intercept) | 802.36 *** | 99.04 | 608.14 – 996.58 |
| year | -0.39 *** | 0.05 | -0.49 – -0.30 |
| FSCI region [Eastern Asia] | 11.45 *** | 2.49 | 6.56 – 16.34 |
| FSCI region [Latin America & Caribbean] | 0.24 | 1.75 | -3.19 – 3.67 |
| FSCI region [Northern Africa & Western Asia] | -3.01 | 1.82 | -6.57 – 0.55 |
| FSCI region [Northern America and Europe] | -7.86 *** | 2.04 | -11.86 – -3.86 |
| FSCI region [Oceania] | -0.14 | 2.10 | -4.26 – 3.97 |
| FSCI region [South-eastern Asia] | 2.59 | 1.96 | -1.25 – 6.42 |
| FSCI region [Southern Asia] | 4.87 * | 2.04 | 0.87 – 8.86 |
| FSCI region [Sub-Saharan Africa] | 15.55 *** | 1.70 | 12.22 – 18.87 |
| Observations | 2494 |  |  |
| R² / R² adjusted | 0.240 / 0.237 |  |  |
Check out the documentation here. This is where you really learn how to use a package. It is written by the author, with abundant vignettes and examples.
We’ve already seen how to make tables above. For static tables,
knitr::kable() is a good choice. The
kableExtra package is also a great extension to knitr,
giving you tons of options for customization. See the docs
for examples.
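As a small taste of kableExtra, here is a minimal sketch that styles a static table. It uses the built-in `mtcars` data as a stand-in, and the styling options are just ones I picked for illustration.

```r
library(kableExtra)  # also makes the %>% pipe available

# Build an HTML table, then add striped rows and hover highlighting.
styled_tbl <- knitr::kable(
  head(mtcars[, 1:4]),
  format = "html",
  caption = "First rows of mtcars"
) %>%
  kable_styling(
    bootstrap_options = c("striped", "hover"),
    full_width = FALSE
  )

styled_tbl
```

Setting `format = "html"` explicitly keeps the result predictable when the code is run outside of a knitted document.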
For interactive tables, the DT package is a great
option, but my personal favorite is reactable. The documentation is
excellent, so check it out if you’re interested.
Note that we are setting echo=FALSE here, so the code
chunk will not show up.
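Because that chunk is hidden, here is a sketch of what a reactable chunk might look like. This is not the hidden code from this document; it uses the built-in `iris` data as a stand-in, and the options are only a sample of what the package offers.

```r
# An interactive table with a search box, striped rows, and paging.
interactive_tbl <- reactable::reactable(
  iris,
  searchable = TRUE,     # adds a global search box
  striped = TRUE,        # alternating row shading
  highlight = TRUE,      # highlight the row under the cursor
  defaultPageSize = 5    # rows shown per page
)

interactive_tbl

# A rough DT equivalent, if you prefer that package:
# DT::datatable(iris, options = list(pageLength = 5))
```

Both packages produce HTML widgets, so they only stay interactive in HTML output.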
We haven’t covered plots yet, but they are simple: put your plotting code in a chunk and the plot will appear in the rendered document.
gapminder %>%
  filter(year == 2007) %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)) +
  geom_point() +
  theme_classic() +
  labs(
    x = 'GDP per Capita',
    y = 'Life Expectancy',
    title = 'Life Expectancy against GDP per Capita'
  )
What about an interactive plot?
plot <- gapminder %>%
  filter(year == 2007) %>%
  ggplot(aes(
    x = gdpPercap,
    y = lifeExp,
    color = continent,
    size = pop,
    text = paste0(
      'Country: ', country, '\n',
      'Continent: ', continent, '\n',
      'Life Exp: ', lifeExp, '\n',
      'Population: ', pop
    )
  )) +
  geom_point() +
  theme_classic() +
  labs(
    x = 'GDP per Capita',
    y = 'Life Expectancy',
    title = 'Life Expectancy against GDP per Capita'
  )
ggplotly(plot, tooltip = 'text')